YAITSASUG

Yet Another Image Transformation Service At Scale Using Golang

dani(dot)caba at gmail(dot)com

Who are you?

dcProfile := map[string]string{
  "name": "Daniel Caballero",
  "title": "Staff Devops Engineer",
  "mail": "dani(dot)caba at gmail(dot)com",
  "company": &SchibstedPT,
  "previously_at": []company{&NTTEurope, &Semantix, &Oracle},
  "linkedin": http.Get("https://www.linkedin.com/in/danicaba"),
  "extra": "Gestión DevOps de Arquitecturas IT@LaSalle",
}

So... I work

... I (some kinda) teach

... I (try to) program...

... I (would like to) rock...

... and I live

So... I value my time (a lot)

And I really don't like to waste it resolving incidents

Schibsgrñvahed..WHAT??

What is Schibsted?

Origin - Media houses

Marketplaces global expansion

Large group of companies

And SPT?

And SPT Platform Services?

It's about a developer experience...

{
    "format": "jpg",
    "watermark": {
        "location": "north",
        "margin": "20px",
        "dimension": "20%"
    },
    "actions": [
        {
            "resize": {
                "width": 300,
                "fit": {
                    "type": "clip"
                }
            }
        }
    ],
    "quality": 90
}
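A minimal Go sketch of how a document like the one above could be decoded, assuming it is a transformation definition handed to the service. The type names (`Spec`, `Watermark`, `Action`, `Resize`, `Fit`) and `ParseSpec` are hypothetical and chosen for illustration; only the JSON field names come from the slide.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Fit describes how a resize adapts the image to the requested box.
type Fit struct {
	Type string `json:"type"`
}

// Resize holds the parameters of a single resize action.
type Resize struct {
	Width int  `json:"width,omitempty"`
	Fit   *Fit `json:"fit,omitempty"`
}

// Action is one step of the pipeline; only "resize" appears in the slide.
type Action struct {
	Resize *Resize `json:"resize,omitempty"`
}

// Watermark mirrors the "watermark" object of the example document.
type Watermark struct {
	Location  string `json:"location"`
	Margin    string `json:"margin"`
	Dimension string `json:"dimension"`
}

// Spec is a hypothetical Go view of the whole transformation document.
type Spec struct {
	Format    string     `json:"format"`
	Watermark *Watermark `json:"watermark,omitempty"`
	Actions   []Action   `json:"actions"`
	Quality   int        `json:"quality"`
}

// ParseSpec decodes a JSON transformation document into a Spec.
func ParseSpec(data []byte) (Spec, error) {
	var s Spec
	if err := json.Unmarshal(data, &s); err != nil {
		return Spec{}, fmt.Errorf("parsing spec: %w", err)
	}
	return s, nil
}

// example is the document from the slide, copied verbatim.
const example = `{
    "format": "jpg",
    "watermark": {"location": "north", "margin": "20px", "dimension": "20%"},
    "actions": [{"resize": {"width": 300, "fit": {"type": "clip"}}}],
    "quality": 90
}`

func main() {
	s, err := ParseSpec([]byte(example))
	if err != nil {
		panic(err)
	}
	fmt.Printf("format=%s quality=%d actions=%d\n", s.Format, s.Quality, len(s.Actions))
}
```

Pointer fields are used for the optional objects so that a missing `watermark` or `fit` can be told apart from an empty one.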

The journey

2+1/2 YEARS AGO

First onboardings

Onboarding pipelines

First nightmares

New Architecture

New Core

Self service capabilities

Updated onboarding pipelines

Current usage

(Your?) thoughts so far...

Why build your own service?

But there are already open-source HTTP servers for that, right?

Why not offline transformations?

Why microservices?

+

  • Quicker releases
  • APIGW helps to delegate common functionality
    • But only the business-agnostic parts
  • Reusability of individual microservices
  • Each microservice can choose different techs
    • We will focus on delivery-images, in Golang
  • Easier to scale with the organization/development team
    • We are not taking advantage of this
  • and... fun

-

  • S2S communication overhead
  • Extra costs
  • More tooling required (logging, tracing...)
  • Reproducing the complete environment becomes tricky
  • Always caring about coupled services...

Why not CDN/edge transformations?

  • Resize functionality and format conversion are normally covered...
    • But not all our functionality (watermarking?)
  • It may mean duplicated processing
  • Not easy to package something like libvips as lambdas
  • No single & global CDN in Schibsted

Not a new story... why not present it before?

Why transformations in golang?

Transformation library

  • imageflow was not production-ready two years ago, with clear gaps in functionality and bindings

Choosing the programming language

Platform (development) properties

IaC

  • Most of the services in AWS...
  • Generating CloudFormation templates with Python troposphere
  • Managing CloudFormation deployments with Sceptre
  • New projects with infrastructure definition in the same repo as the service code
    • Trying to extend CD to Infrastructure
  • We have assessed AWS GoFormation
    • But it still lacks some functionality, like GetAtt or Ref

Code reviews

  • Raffle:
reviewersRaffle:
  strategies:
    - team-with-knowledge-candidates:
        size: 1
        type: knowledge
        participants:
          teams:
            - spt-infrastructure/edge-team
    - team-random-candidates:
        type: sequential
        size: 2
        participants:
          teams:
            - spt-infrastructure/edge-team
  dailyReminder: enabled
slack:
  - "#spt-edge-prs"

Other bots

Continuous delivery

  • travis
  • fpm
  • hardened images
  • spinnaker

Stress testing

Configuration management

Archaius | Viper

Local fork given lack of defaults support

Logs

Monitoring and alerting

And escalations using PagerDuty

Real time monitoring

Distributed tracing

S2S resiliency

Secrets management

Vulnerability scans

delivery-images implementation details

HTTP router

Services communication

  • fargo
  • eureka
  • load balancing
  • hystrix
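The circuit-breaker idea behind hystrix can be sketched with the standard library alone. This is an illustration of the pattern under simple assumptions (consecutive-failure threshold, fixed cooldown), not the API of any hystrix library and not the service's actual code; `Breaker`, `NewBreaker`, `ErrOpen` and `demo` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrOpen is returned when the breaker rejects a call without trying it.
var ErrOpen = errors.New("circuit open")

// Breaker opens after maxFailures consecutive failures and rejects calls
// until cooldown has elapsed; the next call is then let through as a trial.
type Breaker struct {
	mu          sync.Mutex
	maxFailures int
	cooldown    time.Duration
	now         func() time.Time // injectable clock, for deterministic tests
	failures    int
	open        bool
	openedAt    time.Time
}

// NewBreaker returns a closed breaker that uses the wall clock.
func NewBreaker(maxFailures int, cooldown time.Duration) *Breaker {
	return &Breaker{maxFailures: maxFailures, cooldown: cooldown, now: time.Now}
}

// Call runs fn unless the circuit is open and still cooling down.
// The lock is not held while fn runs, so several trial calls may overlap.
func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	if b.open && b.now().Sub(b.openedAt) < b.cooldown {
		b.mu.Unlock()
		return ErrOpen
	}
	b.mu.Unlock()

	err := fn()

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		// A failed trial call, or too many failures in a row, (re)opens the circuit.
		if b.open || b.failures >= b.maxFailures {
			b.open = true
			b.openedAt = b.now()
		}
		return err
	}
	b.failures = 0
	b.open = false
	return nil
}

// demo drives the breaker with a fake clock and records each outcome.
func demo() []string {
	clock := time.Unix(0, 0)
	b := NewBreaker(2, 10*time.Second)
	b.now = func() time.Time { return clock }

	failing := func() error { return errors.New("backend down") }
	healthy := func() error { return nil }

	var out []string
	record := func(err error) {
		switch {
		case err == nil:
			out = append(out, "ok")
		case errors.Is(err, ErrOpen):
			out = append(out, "rejected")
		default:
			out = append(out, "failed")
		}
	}

	record(b.Call(failing)) // first failure
	record(b.Call(failing)) // second failure opens the circuit
	record(b.Call(healthy)) // rejected while cooling down
	clock = clock.Add(11 * time.Second)
	record(b.Call(healthy)) // trial call succeeds and closes the circuit
	record(b.Call(healthy))
	return out
}

func main() {
	fmt.Println(demo())
}
```

Rejecting calls while the circuit is open keeps a struggling dependency from tying up request handlers; what to serve instead (a fallback or an error) is left to the caller.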

bi-image + libvips

Caching

Zipkin

Custom metrics

Gifs..

And rate limits..
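One common way to implement rate limits is a token bucket. The sketch below is a standard-library illustration of that technique, not the limiter the service actually uses; `TokenBucket`, `NewTokenBucket` and `demo` are hypothetical names, and the capacity and refill rate are arbitrary.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// TokenBucket holds up to capacity tokens and refills at rate tokens per second.
// Each allowed request consumes one token.
type TokenBucket struct {
	mu       sync.Mutex
	capacity float64
	rate     float64
	tokens   float64
	last     time.Time
}

// NewTokenBucket returns a full bucket whose refill clock starts at now.
func NewTokenBucket(capacity, rate float64, now time.Time) *TokenBucket {
	return &TokenBucket{capacity: capacity, rate: rate, tokens: capacity, last: now}
}

// Allow reports whether a request arriving at now may proceed.
// The caller passes the time explicitly so the behaviour is easy to test.
func (b *TokenBucket) Allow(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	// Refill according to the time elapsed since the last refill, capped at capacity.
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}

	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// demo sends a burst of three requests, waits one second, then sends two more.
func demo() []bool {
	start := time.Unix(0, 0)
	b := NewTokenBucket(2, 1, start) // burst of 2, refill 1 token per second

	var out []bool
	out = append(out, b.Allow(start))
	out = append(out, b.Allow(start))
	out = append(out, b.Allow(start)) // bucket is empty here
	later := start.Add(time.Second)
	out = append(out, b.Allow(later)) // one token has been refilled
	out = append(out, b.Allow(later))
	return out
}

func main() {
	fmt.Println(demo())
}
```

In an HTTP service a rejected request would typically be answered with status 429; a per-client limit would keep one bucket per client key.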

go-kit/negroni logrus logging

Integration tests execution

Datastore access

Graceful shutdowns
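A minimal sketch of a graceful shutdown with `net/http`, assuming the goal is to let in-flight transformations finish before an instance goes away. The `serve` and `demo` helpers, the `/slow` route and the timings are hypothetical; in a real service the context would more likely be cancelled by `signal.NotifyContext` on SIGTERM than by hand.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// serve runs srv on ln until ctx is cancelled, then stops accepting new
// connections and waits up to grace for in-flight requests to finish.
func serve(ctx context.Context, srv *http.Server, ln net.Listener, grace time.Duration) error {
	errCh := make(chan error, 1)
	go func() { errCh <- srv.Serve(ln) }()

	select {
	case err := <-errCh:
		return err // the server stopped before any shutdown was requested
	case <-ctx.Done():
	}

	shutdownCtx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		return err
	}
	// After Shutdown, Serve returns http.ErrServerClosed, which is expected.
	if err := <-errCh; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// demo requests shutdown while a slow request is in flight and returns the
// body that the client still receives.
func demo() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	started := make(chan struct{})
	mux := http.NewServeMux()
	mux.HandleFunc("/slow", func(w http.ResponseWriter, r *http.Request) {
		close(started)                     // the request is now in flight
		time.Sleep(200 * time.Millisecond) // stand-in for an image transformation
		fmt.Fprint(w, "done")
	})
	srv := &http.Server{Handler: mux}

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	serveErr := make(chan error, 1)
	go func() { serveErr <- serve(ctx, srv, ln, 5*time.Second) }()

	body := make(chan string, 1)
	reqErr := make(chan error, 1)
	go func() {
		resp, err := http.Get("http://" + ln.Addr().String() + "/slow")
		if err != nil {
			reqErr <- err
			return
		}
		defer resp.Body.Close()
		b, err := io.ReadAll(resp.Body)
		if err != nil {
			reqErr <- err
			return
		}
		body <- string(b)
	}()

	select {
	case <-started:
	case err := <-reqErr:
		return "", err
	}
	cancel() // ask for shutdown while /slow is still being handled
	if err := <-serveErr; err != nil {
		return "", err
	}
	select {
	case b := <-body:
		return b, nil
	case err := <-reqErr:
		return "", err
	}
}

func main() {
	fmt.Println(demo())
}
```

The grace period bounds how long a terminating instance can hold on to its connections, so it should be chosen with the longest expected request in mind.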

Autoscaling

AWS SDK usage

Stress tests

  • Locust -> Vegeta

Multiregion

Smoke tests

More elasticity to reduce costs

Extra compression

Currently jpg-turbo

Bringing the service closer to the business

More engines

  • Also in use by attachments and CVs
  • PDF conversion makes sense
  • Benchmark already done
  • Video

Actual transformation pipelines

Include current workflow

More adoption?

Better capacity management

  • Incoming queue and reusing the cache if there is no capacity
  • Better degradation but efficient ASG triggers

ApiGW replacement?

Zuul could be replaced by KrakenD

Simulating dependencies failures

Why not terraform?

Why not docker/k8s?

  • Portal
  • Migration exercise
  • Local tests

gRPC?

Why not Service Meshes?

Why not Google Cloud?

And Cassandra?

And PaaS?

And prometheus?

Before closing...

Are you going to open-source it?

Price calculator

  • Share price comparison

Other projects

Choosing the right regions

Classifier end to end tests

Corollary

Keep Rx in the code...

Many thanks...

  • Sch*
  • Edge colleagues

Other Qs?